Temporal Action Segmentation with High-level Complex Activity Labels

Abstract

The temporal action segmentation task segments videos temporally and predicts action labels for all frames. Fully supervising such a segmentation model requires dense frame-wise action annotations, which are expensive and tedious to collect. This work is the first to propose a Constituent Action Discovery (CAD) framework that requires only the video-wise high-level complex activity label as supervision for temporal action segmentation. The proposed approach automatically discovers constituent video actions using an activity classification task. Specifically, we define a finite number of latent action prototypes to construct video-level dual representations, with which these prototypes are learned collectively through the activity classification training. This setting endows our approach with the capability to discover actions potentially shared across multiple complex activities. Due to the lack of action-level supervision, we adopt the Hungarian matching algorithm to relate the latent action prototypes to ground truth semantic classes for evaluation. We show that, with the high-level supervision, the Hungarian matching can be extended from the existing video and activity levels to the global level. The global-level matching allows for action sharing across activities, which has never been considered in the literature before. Extensive experiments demonstrate that our discovered actions can help perform temporal action segmentation and activity recognition tasks.
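
The evaluation step mentioned in the abstract, relating discovered prototypes to ground truth classes with Hungarian matching, can be illustrated with a minimal sketch. This is not the authors' code. It assumes the matching score is the frame-wise overlap between predicted prototype indices and ground truth class indices, and that there are as many prototypes as classes. The names hungarian_min_cost and match_prototypes and the toy labels are hypothetical. For a global-level matching, the two lists would hold the frames of all videos concatenated.

```python
def hungarian_min_cost(cost):
    """Textbook O(n^3) Hungarian algorithm for a square cost matrix.

    Returns a list where assignment[row] is the column given to that row.
    """
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials (1-indexed)
    v = [0] * (n + 1)    # column potentials (1-indexed)
    p = [0] * (n + 1)    # p[j] = row currently matched to column j (0 = none)
    way = [0] * (n + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def match_prototypes(pred, gt, num_labels):
    """Map each prototype index to the class that maximizes total overlap."""
    overlap = [[0] * num_labels for _ in range(num_labels)]
    for k, c in zip(pred, gt):
        overlap[k][c] += 1
    # Maximizing overlap is the same as minimizing negated overlap.
    cost = [[-count for count in row] for row in overlap]
    return hungarian_min_cost(cost)


# Toy frame-wise labels (assumed data, for illustration only).
pred = [2, 2, 2, 0, 0, 1, 1, 1]  # discovered prototype index per frame
gt = [0, 0, 0, 1, 1, 2, 2, 1]    # ground truth class index per frame

mapping = match_prototypes(pred, gt, 3)  # mapping[prototype] = class
remapped = [mapping[k] for k in pred]
accuracy = sum(r == g for r, g in zip(remapped, gt)) / len(gt)
```

Because the prototypes carry no semantic names, frame accuracy is only meaningful after such a one-to-one mapping. In the toy case above, the mapping sends prototype 2 to class 0, prototype 0 to class 1, and prototype 1 to class 2.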

Similar resources

Semiautomatic Image Retrieval Using the High Level Semantic Labels

Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges related to each of these approaches lead researchers to use combined approaches and semi-automatic retrieval with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provided two kind of qu...

SOL: Segmentation with Overlapping Labels

Image segmentation is a fundamental problem in computer vision that involves dividing an image into two or more segments. These segments usually correspond to objects of interest in the image, e.g. the liver or kidneys. The classic approach to this problem segments the image into mutually exclusive segments. However, this approach is not well-suited when segmenting overlapping objects, e.g. c...

Interactive Segmentation with Super-Labels

In interactive segmentation, the most common way to model object appearance is by GMM or histogram, while MRFs are used to encourage spatial coherence among the object labels. This makes the strong assumption that pixels within each object are i.i.d. when in fact most objects have multiple distinct appearances and exhibit strong spatial correlation among their pixels. At the very least, this ca...

High Level Fuzzy Labels for Vague Concepts

Vague or imprecise concepts are fundamental to natural language. Human beings constantly use imprecise language to communicate with each other. We usually say ‘John is tall and strong’ but not ‘John is exactly 1.85 meters in height and he can lift 100kg weights’. Humans have a remarkable capability to perform a wide variety of physical and mental tasks without any measurements. This capability...

Journal

Journal title: IEEE Transactions on Multimedia

Year: 2023

ISSN: 1520-9210, 1941-0077

DOI: https://doi.org/10.1109/tmm.2022.3231099